Overview

NAVO has started regularly querying some TAP and Cone Search services to collect data on their response times. So far these are mostly NAVO services, but a CDS 2MASS cone search is included for comparison. (Some Chandra Source Catalog queries are also run, but because of that catalog's sparse sky coverage these need to be adjusted.)

The queries are done using the servicemon application (https://servicemon.readthedocs.io/en/latest/), and are executed from several different locations. The AWS instrumentation is handled with the software at https://github.com/NASA-NAVO/AWS_servicemon. The results are written to a TAP-accessible database currently running at IPAC.

Collaborating

Now that everyone can examine the monitoring data and run additional tests, everyone can contribute:

Known Issues/Action Items

Short term:

Longer term:

What Tests Are Run

All of the query parameters are configurable, but below is what is currently running. All TAP queries are currently run asynchronously.

Services

base_name service_type
2MASS_STScI cone
CDS_2MASS cone
CSC cone
CSC tap
HEASARC_swiftmastr cone
HEASARC_swiftmastr tap
HEASARC_xmmssc cone
HEASARC_xmmssc tap
IPAC_2MASS cone
IPAC_2MASS tap
IPAC_WISE cone
IPAC_WISE tap
NED cone
NED tap
PanSTARRS tap
PanSTARRS xcone
STScI_ObsTAP tap
WISE_ST cone

When and what cones?

A set of 10 random cone queries, with radii ranging from 0 to 0.25 degrees, is run for each service every 6 hours. The exact hours are staggered by location.

We should change this to include (or only use) fixed cones, so that we can compare the exact same queries over time. (servicemon can be run with fixed or random targets.)
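
As a rough illustration of the fixed-versus-random idea, the sketch below generates 10 cones with radii between 0 and 0.25 degrees and makes them repeatable by seeding the random number generator. This is not servicemon's own sampling code, which is not described here; the function name `random_cones`, the `seed` parameter, and the uniform-on-the-sphere sampling are all assumptions made for the example.

```python
import math
import random


def random_cones(n=10, max_radius_deg=0.25, seed=None):
    """Return n (ra, dec, sr) tuples in degrees (hypothetical helper).

    Passing a seed makes the list reproducible, which is one way to get
    "fixed" cones that can be compared across runs.
    """
    rng = random.Random(seed)
    cones = []
    for _ in range(n):
        ra = rng.uniform(0.0, 360.0)
        # Sample sin(dec) uniformly so cones do not crowd toward the poles.
        dec = math.degrees(math.asin(rng.uniform(-1.0, 1.0)))
        sr = rng.uniform(0.0, max_radius_deg)
        cones.append((ra, dec, sr))
    return cones


# The same seed always yields the same 10 cones.
fixed_cones = random_cones(seed=42)
```

Calling `random_cones()` without a seed gives a fresh random set each time, so both modes can share one code path.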

From Where

The queries are run from the following AWS regions:

'ap-northeast-1', 'ap-southeast-2', 'eu-west-3', 'sa-east-1', 'us-east-1', 'us-west-2'

Due to testing, the database also contains results from other locations like:

'ip-172-31-36-250.us-west-2.compute.internal', 'ip-172-31-43-179.ap-northeast-1.compute.internal', 'kvmexodev.ipac.caltech.edu'

Result Data Available via TAP

The TAP service at http://navo01.ipac.caltech.edu/TAP has a table called navostats with one row per query run by servicemon.

Note: The VOSI endpoints have not yet been implemented for this service, so PyVO and Topcat will complain during metadata gathering. Both can still be used to query the service, and all the TAP_SCHEMA tables are implemented, so those can be used to query metadata.
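
A minimal sketch of querying navostats with PyVO might look like the following. The helper names `build_stats_query` and `fetch_stats` are hypothetical, the selected columns are taken from the column tables in this document, and the query assumes the table can be referenced simply as `navostats`. PyVO is imported inside `fetch_stats` so the query builder can be used on its own.

```python
TAP_URL = "http://navo01.ipac.caltech.edu/TAP"


def build_stats_query(base_name, service_type, top=100):
    """Build an ADQL query for one service's rows (hypothetical helper)."""
    # Double any single quotes so the values stay valid ADQL string literals.
    name = base_name.replace("'", "''")
    stype = service_type.replace("'", "''")
    return (
        f"SELECT TOP {int(top)} base_name, service_type, location, start_time, "
        "int0_duration, int1_duration, num_rows "
        "FROM navostats "
        f"WHERE base_name = '{name}' AND service_type = '{stype}'"
    )


def fetch_stats(base_name, service_type, top=100):
    """Run the query synchronously and return an astropy Table."""
    import pyvo  # imported lazily; only needed when actually querying

    service = pyvo.dal.TAPService(TAP_URL)
    query = build_stats_query(base_name, service_type, top)
    return service.search(query).to_table()


# Building the query needs no network access.
query = build_stats_query("IPAC_2MASS", "cone", top=5)
```

Calling `fetch_stats("IPAC_2MASS", "cone")` would then contact the service; expect the metadata warnings mentioned in the note above.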

The following columns are available:

Query Description

Query Input

column_name datatype format description
ra double 20.6f Right Ascension of the query cone region.
dec double 20.6f Declination of the query cone region.
sr double 20.6f Radius of the query cone region (deg).
adql char 300s For TAP queries this is the full ADQL query that was done. Empty for non-TAP queries.

Other Query Metadata

column_name datatype format description
access_url char 300s The base URL of the service.
base_name char 20s A short name of the service given by the servicemon configuration files. Not yet consistent for all services.
service_type char 20s While other values are possible, the main service types we're tracking now are tap, cone, and xcone (like cone, but not VO-compliant).
location char 80s Self-declared location of the monitoring service (e.g., AWS region).
start_time char 30s The date and time that the query was started (format='%Y-%m-%d %H:%M:%S.%f').
end_time char 30s The date and time that the query was completed (format='%Y-%m-%d %H:%M:%S.%f').
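
Because start_time and end_time are stored as strings, they need to be parsed before doing any arithmetic on them. The sketch below uses the format string given in the table; the helper name `elapsed_seconds` and the example timestamps are made up for illustration, and the `strip()` calls assume the fixed-width char values may come back padded with spaces.

```python
from datetime import datetime

# Format string as documented for start_time and end_time.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start_time, end_time):
    """Return wall-clock seconds between two timestamp strings."""
    start = datetime.strptime(start_time.strip(), TIME_FORMAT)
    end = datetime.strptime(end_time.strip(), TIME_FORMAT)
    return (end - start).total_seconds()


# Illustrative values only, not taken from the database.
example = elapsed_seconds("2020-05-01 12:00:00.250000",
                          "2020-05-01 12:00:03.750000")
```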

Query Results

Note that these values may be empty for certain types of query failures.

Timing

column_name datatype format description
int0_desc char 20s Description of the first interval measured (int0).
int0_duration double 20.6f The duration of the first interval measured. So far this is the time to an HTTP response indicating that the query is complete, but prior to the results being streamed back to the client.
int1_desc char 20s Description of the second interval measured (int1).
int1_duration double 20.6f The duration of the second interval measured. So far this is the time to download the results after the HTTP response indicating that the query was complete.
int2_desc char 20s Description of the third interval measured (int2).
int2_duration double 20.6f For cone and sync TAP queries, this is the total duration from when the query was issued until result download was complete. For async TAP, this is the amount of time it took to submit the async query.

Result metadata

column_name datatype format description
num_columns integer 9d Number of FIELDs in the result VOTable.
num_rows integer 9d Number of rows in the result VOTable.
size integer 10d Size of the result VOTable (bytes).

Querying and Plotting the Data

Imports

This code requires an environment that includes servicemon, bokeh and pandas.

Sample Plotting Functions

These functions convert our query results to a pandas DataFrame and then draw some sample plots with bokeh, either in a notebook or on a web page.

Create Data Source for Bokeh

Support tooltips for plotted points.
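
The notebook's own code is not reproduced here, so the following is only a sketch of how a Bokeh data source with hover tooltips could be built from query results. The helper names `make_tooltips` and `make_source_and_hover` are hypothetical, and the input is assumed to be a list of row dictionaries keyed by the navostats column names. pandas and bokeh are imported inside the second function so the tooltip helper can be used without them.

```python
def make_tooltips(columns):
    """Map column names to Bokeh tooltip pairs, e.g. ("num_rows", "@num_rows")."""
    return [(name, "@" + name) for name in columns]


def make_source_and_hover(rows, tooltip_columns):
    """Build a ColumnDataSource and a matching HoverTool from row dicts."""
    import pandas as pd
    from bokeh.models import ColumnDataSource, HoverTool

    df = pd.DataFrame(rows)
    # Parse the timestamp strings so Bokeh can plot them on a datetime axis.
    df["start_time"] = pd.to_datetime(df["start_time"])
    source = ColumnDataSource(df)
    hover = HoverTool(tooltips=make_tooltips(tooltip_columns))
    return source, hover


# The "@name" syntax tells Bokeh to look the value up in the data source.
tooltips = make_tooltips(["base_name", "location", "num_rows"])
```

The returned `hover` tool can be attached to a figure with `add_tools`, and `source` passed to any glyph method via its `source` argument.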

Plot Query Duration versus Start Time

Plot Query and Download Duration versus Number of Result Rows

Create Plots for all Services

Loop through the stats for multiple services and display the plots in the notebook or on an HTML page.
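
One simple way to organize such a loop is to group the rows by service first, so each (base_name, service_type) pair gets its own plot. The sketch below shows only the grouping step, using plain dictionaries; the helper name `group_by_service` and the sample rows are invented for illustration.

```python
from collections import defaultdict


def group_by_service(rows):
    """Group row dicts by their (base_name, service_type) pair."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["base_name"], row["service_type"])].append(row)
    return dict(groups)


# Made-up rows; real ones would come from the navostats table.
sample_rows = [
    {"base_name": "IPAC_2MASS", "service_type": "cone", "int0_duration": 0.8},
    {"base_name": "IPAC_2MASS", "service_type": "tap", "int0_duration": 2.1},
    {"base_name": "IPAC_2MASS", "service_type": "cone", "int0_duration": 0.9},
]

groups = group_by_service(sample_rows)
# Each key can now drive one plot; here we just count rows per service.
counts = {key: len(service_rows) for key, service_rows in groups.items()}
```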

Example: Use Above Methods

More Plot Ideas

Sample plot differentiating location by shape
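
A possible way to distinguish locations by shape is to assign each AWS region its own Bokeh marker and fall back to a default for the test hosts. The marker assignments, the helper names `marker_for` and `plot_by_location`, and the choice of int0_duration for the y axis are all assumptions made for this sketch. Rows are assumed to be dictionaries with `location`, `start_time` (already parsed to `datetime`), and `int0_duration` keys.

```python
REGIONS = ["ap-northeast-1", "ap-southeast-2", "eu-west-3",
           "sa-east-1", "us-east-1", "us-west-2"]
# One distinct Bokeh marker name per region (arbitrary assignment).
MARKERS = ["circle", "square", "triangle",
           "diamond", "inverted_triangle", "hex"]
MARKER_BY_REGION = dict(zip(REGIONS, MARKERS))


def marker_for(location, default="x"):
    """Return the marker for a location; unknown hosts get the default."""
    return MARKER_BY_REGION.get(location, default)


def plot_by_location(rows, title="Query duration by location"):
    """Scatter int0_duration against start_time, one marker per location."""
    from bokeh.plotting import figure  # imported lazily; needs bokeh

    p = figure(x_axis_type="datetime", title=title)
    for location in sorted({row["location"] for row in rows}):
        subset = [row for row in rows if row["location"] == location]
        p.scatter(
            x=[row["start_time"] for row in subset],
            y=[row["int0_duration"] for row in subset],
            marker=marker_for(location),
            size=8,
            legend_label=location,
        )
    return p
```

The figure returned by `plot_by_location` can be displayed with `bokeh.io.show`, either in a notebook or as a standalone HTML page.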

Try out the location plots